Genome Biology

Springer Science and Business Media LLC

Preprints posted in the last 30 days, ranked by how well they match Genome Biology's content profile, based on 14 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.

1
Distinguishing causal from tagging enhancers using single-cell multiome data

Dorans, E.; Price, A. L.

2026-02-17 genetic and genomic medicine 10.64898/2026.02.15.26346353
#1
107× avg

Methods that analyze single-cell RNA-seq+ATAC-seq multiome data have shown promise in linking enhancers to target genes by correlating chromatin accessibility with gene expression across cells. However, correlations among ATAC-seq peaks may induce non-causal tagging peak-gene links (analogous to tagging associations in GWAS); indeed, we confirm that tagging effects induced by peak co-accessibility are pervasive in peak-gene linking. We defined two scores for each ATAC-seq peak: co-accessibility score, the sum of squared correlations with each nearby peak; and co-activity score, the sum of squared correlations with each nearby gene. We compared these scores in 4 multiome data sets (spanning 86k cells and 6 immune/blood cell types) and determined that co-accessibility score and co-activity score were strongly correlated across peaks (r = 0.57-0.73); these correlations were not explained by read depth, cell subtypes, or measurement noise, but are consistent with tagging. Indeed, non-causal peak-gene correlations were strongly correlated with a peak's tagging correlation with a causal peak in CRISPRi data (r = 0.92). We further determined that causal peak-gene associations are concentrated in specific functional categories of peaks, by regressing co-activity scores on stratified co-accessibility scores (S-CASC): e.g. 2.91x (s.e. 0.67) enrichment for peaks closest to a gene's TSS and 1.41x (s.e. 0.11) enrichment for peaks overlapping H3K27ac marks. Co-accessibility scores were substantially driven by the number of transcription factor binding sites (TFBS) within a peak, and peak-peak correlations were substantially driven by the number of TFBS pairs within the two peaks with a shared TF. These effects were concentrated in a small number of pioneer TFs, which activate repressed chromatin regions. 
Consistent with widespread tagging, peak-gene links that we fine-mapped using SuSiE significantly outperformed marginal peak-gene links in evaluation sets derived from CRISPRi and eQTL data. We provide examples demonstrating the impact of tagging effects at specific peaks and genes implicated in GWAS of blood cell traits. Our findings underscore the importance of accounting for tagging effects when linking enhancers to target genes.
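
The two per-peak scores defined in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation across cells, a hypothetical distance window for "nearby", and exclusion of a peak's correlation with itself; none of these choices is stated in the abstract, and the function names and data layout are invented for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of per-cell values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def sum_squared_correlations(target, others):
    """Sum of squared correlations between one feature and a list of other features."""
    return sum(pearson(target, other) ** 2 for other in others)


def peak_scores(peaks, genes, window):
    """Return {peak: (co_accessibility, co_activity)}.

    peaks, genes: dict mapping a name to (genomic position, per-cell values).
    window: hypothetical distance cutoff defining "nearby" (an assumption).
    """
    scores = {}
    for name, (pos, accessibility) in peaks.items():
        # Nearby peaks, excluding the peak itself (assumed convention).
        near_peaks = [vals for other, (p, vals) in peaks.items()
                      if other != name and abs(p - pos) <= window]
        near_genes = [vals for p, vals in genes.values() if abs(p - pos) <= window]
        scores[name] = (sum_squared_correlations(accessibility, near_peaks),
                        sum_squared_correlations(accessibility, near_genes))
    return scores
```

In this toy form, a peak that is tightly co-accessible with many neighbours accumulates a high co-accessibility score, which is the quantity the abstract relates to tagging.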

2
Airborne particulate matter enhances with monosodium urate crystals the secretion of IL-1b by human immune cells

Razazan, A.; Merriman, M.; Burden, N.; Reynolds, R.; Joosten, L. A.; Hussain, S.; Merriman, T.

2026-03-02 rheumatology 10.64898/2026.02.26.26347218
#1
99× avg

Gout is driven by an interleukin-1β-mediated intense innate immune reaction to monosodium urate (MSU) crystals (MSUc). In cell culture models of inflammatory gout there is a synergistic effect of phagocytosis of MSUc and TLR2 and TLR4 activation by agonists such as free fatty acid and lipopolysaccharide (LPS) in NLRP3-inflammasome activation and IL-1β secretion. A substantial number of gout patients do not report a dietary trigger, and observational studies associate airborne particulate matter with incident gout and flares. Airborne particulate matter contains LPS and airborne-derived particulate matter stimulates IL-1β secretion in cell culture. We hypothesized that airborne particulate matter could co-stimulate, with MSUc, IL-1β secretion and inflammation. We tested the hypothesis using MSUc with extracted airborne PM4 in human cells (the THP-1 monocyte cell line, primary human monocytes and PBMCs) or carbon black particles with ozone (CB+O3) in a murine foot-pad injection model of gout. There was strong NLRP3-inflammasome-dependent co-stimulation of IL-1β secretion in THP-1 cells with PM4+MSUc and a moderate additive effect in primary human PBMCs. However, there was no added effect on IL-1β secretion of PM4 in isolated primary human monocytes. Inhalation of CB+O3 persistently exacerbated MSUc-induced murine paw inflammation, with an increase of alveolar/lavage macrophages that contained CB+O3 particles and increased lavage expression of IL-1β. In conclusion, airborne-derived PM4 particulate matter enhanced MSUc-induced IL-1β secretion in THP-1 cells and PBMCs. Combined with exacerbation of MSUc-induced inflammation by fine particulate matter in in vivo experiments, these data provide evidence that exposure to fine particulate matter may play a role in the etiology of gout.

3
Genetically informed search for potential osteoarthritis drug targets across the proteome

Liu, W.; Zuckerman, B. P.; Schuermans, A.; Orozco, G.; Honigberg, M. C.; Bowes, J.; O'Neill, T. W.; Zhao, S. S.

2026-02-11 rheumatology 10.64898/2026.02.10.26345885
#1
92× avg

Background: Osteoarthritis (OA) is a leading cause of disability worldwide, yet no licensed therapies can prevent or slow its progression. We aimed to identify potential targets for disease-modifying OA drugs (DMOADs) by integrating genetic and differential protein expression (DPE) evidence. Methods: We evaluated genetically predicted perturbations of plasma protein levels using cis-protein quantitative trait loci (cis-pQTLs) across three large European cohorts (UK Biobank Pharma Proteomics Project, deCODE, and Fenland) and outcome data from the Genetics of Osteoarthritis Consortium, covering 11 OA phenotypes. DPE analyses were performed in 44,789 UKB participants, comparing 2,920 protein measurements between OA cases and controls, supported by sensitivity analyses. Proteins identified through genetic and/or DPE approaches were further assessed in downstream analyses. Findings: In total, 305 proteins showed evidence of association with OA through genetically predicted perturbations, with 81 supported by colocalisation across datasets. DPE analyses identified 605 proteins associated with at least one OA phenotype, of which 450 (74·4%) remained robust after sensitivity testing. Several novel targets were identified, including PPP1R9B, PCSK7, and ITIH4. Integration of both approaches prioritised 5 proteins, 4 of which demonstrated druggable potential, including 3 high-confidence candidates: DLK1, TNFRSF9, and OGN. Downstream analyses highlighted key biological pathways and candidate compounds with potential for repurposing. Interpretation: This large-scale study combines genetic and DPE evidence to prioritise candidate DMOAD targets. Findings reinforce established biology while revealing novel proteins and pathways, providing a foundation for therapeutic development in OA. Funding: WL is supported by the Guangzhou Elite Project (project no. JY202314). SSZ is supported by The University of Manchester Dean's Prize and an Arthritis UK Career Development Fellowship (grant no. 23258). This work is supported by the NIHR Manchester Biomedical Research Centre (NIHR203308).
Research in context. Evidence before this study: Circulating proteins have been linked to osteoarthritis (OA) in observational studies, supporting their potential as biomarkers and drug targets. However, differential protein expression analyses are vulnerable to confounding and reverse causation. Mendelian randomisation (MR) studies using proteomic GWAS instruments have suggested causal roles for several circulating proteins in OA-related traits and highlighted druggable candidates. However, many analyses relied on earlier OA GWAS data (e.g., Genetics of Osteoarthritis Consortium 1·0) and smaller proteomic GWAS datasets, and typically did not integrate MR findings with large-scale differential protein expression. As a result, it remains unclear how well genetically predicted protein effects align with observed protein expression in OA, and how robust prioritised targets are when replicated across proteomic data from multiple cohorts. Added value of this study: This study integrates large-scale proteomic MR and differential protein expression (DPE) analyses across multiple OA phenotypes using the largest datasets to date. By combining genetic evidence with observed protein dysregulation in population-based cohorts, we strengthen causal inference and improve robustness of target prioritisation. This approach allows us to distinguish proteins that are likely to play a causal role in OA from those that reflect downstream disease processes, and to highlight targets with greater translational relevance than identified by either method alone. Implications of all the available evidence: Taken together, our findings support a causal role for a subset of circulating proteins in OA and demonstrate the value of integrating genetic and observational proteomic data for target prioritisation. Proteins supported by both MR and DPE are more likely to represent biologically relevant drivers of disease and actionable therapeutic targets. This integrated framework reduces false positives arising from confounding or reverse causation and provides a more reliable basis for drug development, biomarker discovery, and patient stratification in OA.

4
Benchmarking HLA genotyping from whole-genome sequencing across multiple sequencing technologies

Cremin, C.; Elavalli, S.; Paulin, L.; Arres Reche, J.; Saad, A. A. Y. A.; Attia, A.; Minas, C.; Aldhuhoori, F.; Katagi, G.; Wu, H.; Sidahmed, H.; Mafofo, J.; Soliman, O.; Behl, S.; Pariyachery, S.; Gupta, V.; Ghanem, D.; Sajjad, H.; Cardoso, T.; El-Khani, A.; Al Marzooqi, F.; Magalhaes, T.; Sedlazeck, F. J.; Quilez, J.

2026-02-12 health informatics 10.64898/2026.02.10.26345621
#1
64× avg

Background: The hyperpolymorphic nature and structural complexity of the human leukocyte antigen (HLA) genomic region present challenges for accurate and scalable typing across diverse sample types. While whole-genome sequencing (WGS) offers the opportunity to infer HLA genotypes without targeted enrichment, systematic benchmarks across sequencing platforms, biospecimens and coverage levels remain limited. Results: We assembled a multi-platform resource of WGS datasets derived from short-read (Illumina, MGI) and long-read (Oxford Nanopore Technologies R9 and R10) sequencing, spanning 29 biospecimens including cell lines, blood, buccal swab and saliva. We evaluated the performance of the HLA caller HLA*LA across 13 HLA genes, using a clinically validated assay as reference. WGS-based HLA genotyping achieved ~95% accuracy across sequencing platforms, with Class I loci exhibiting higher accuracy than Class II. Cross-platform concordance was high, and performance remained consistent across Illumina, MGI and Oxford Nanopore chemistries. Analysis of blood, buccal swab and saliva samples showed that blood and buccal swabs supported accurate HLA inference, whereas saliva yielded reduced concordance. Downsampling experiments demonstrated that 15x coverage was sufficient to retain >95% accuracy at two-field resolution, with lower depths supporting lower-resolution typing. Conclusions: Our results demonstrate that WGS provides a robust, platform-agnostic framework for accurate HLA genotyping across sample types and coverage levels. These benchmarks establish practical conditions for reliable HLA inference and underscore the utility of WGS for population-scale HLA analyses and future clinical applications.
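
To make "accuracy at two-field resolution" concrete, the sketch below truncates HLA allele names to a given number of colon-separated fields and compares called alleles against a reference. It is an illustrative assumption about how such concordance could be computed, not the benchmark's actual procedure; the function names are hypothetical, and expression suffixes (such as a trailing N or L) are not handled.

```python
def truncate_fields(allele, n_fields=2):
    """Reduce an HLA allele name such as 'A*02:01:01:01' to its first n_fields fields."""
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:n_fields])


def concordance(calls, truth, n_fields=2):
    """Fraction of called alleles matching the reference at the chosen resolution.

    calls and truth are equal-length lists of allele names, paired by position.
    """
    matches = sum(
        truncate_fields(call, n_fields) == truncate_fields(ref, n_fields)
        for call, ref in zip(calls, truth)
    )
    return matches / len(truth)
```

For example, `A*02:01:01` and `A*02:01:05` agree at two-field resolution but not at three-field resolution, which is why lower sequencing depth can still support lower-resolution typing.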

5
Longitudinal peripheral blood multi-omic profiling in seropositive individuals identifies immune endotypes and predictive models for future rheumatoid arthritis conversion

Inamo, J.; Bylinska, A.; Smith, M.; Vanderlinden, L.; Wright, C.; Stephens, T.; Feser, M. L.; Striebich, C. C.; O'Dell, J. R.; Sparks, J. A.; Davis, J. M.; Graf, J.; McMahon, M. A.; Solow, E. B.; Forbess, L. J.; Tiliakos, A. N.; Fox, D. A.; Danila, M. I.; Horowitz, D. L.; Kay, J.; James, J. A.; Holers, V. M.; Deane, K. D.; Guthridge, J. M.; Zhang, F.

2026-02-17 rheumatology 10.64898/2026.02.12.26346058
Top 0.1%
54× avg

Individuals who have serum elevations of anti-cyclic citrullinated protein (anti-CCP) antibodies are at risk for developing rheumatoid arthritis (RA), yet immunologic factors that lead to a transition from pre- to clinical RA remain unclear. Here, we used materials from anti-CCP antibody-positive individuals enrolled in a clinical trial that evaluated the efficacy of hydroxychloroquine to prevent clinical RA, and performed multi-modal single-cell profiling (transcriptome, surface proteins, T/B-cell receptor sequencing, and chromatin accessibility) on samples obtained at baseline and at RA onset in those who developed clinical RA (Converters) or follow-up point in matched Nonconverters. At both baseline and follow-up, Converters had expansions of peripheral helper T (Tph) cells and CD8+ T cells expressing GZMK and GZMB, along with elevated potentially autoreactive T-cell receptors in CD4+ T cells compared to Nonconverters. Induction of age-associated B cell signatures was observed in B cells of Converters prior to RA onset. Epigenetic profiling further identified chromatin accessibility changes in Converters over time, particularly within myeloid and NK cells. Lastly, predictive modeling using baseline immune features, including Tph cells, GZMK+XCL1+ CD8+, and GZMB+CD57+ CD8+ T cells, together with clinical features such as anti-CCP3 levels, RF-positivity, and HLA shared epitope status, stratified RA risk and predicted time to onset. These findings define immune endotypes in pre-RA that could serve as targets for future preventive interventions and be used to stratify the risk of developing clinical RA in anti-CCP antibody-positive individuals.

6
A Common Missense Variant, W335S, in β2-Glycoprotein I (APOH) is Associated with Increased Autoantibody Levels but Reduced Venous Thromboembolism Risk

Lalaurie, C.; Liu, L.; Khan, A.; Wang, C.; Rich, S.; Barr, R. G.; Bernstein, E.; Kiryluk, K.; McDonnell, T. C. R.; Luo, Y.

2026-03-05 rheumatology 10.64898/2026.03.04.26347632
Top 0.1%
53× avg

Anti-β2-glycoprotein I (anti-β2GPI) antibodies are central to the pathogenesis of antiphospholipid syndrome (APS), an autoimmune disease characterized by a strong predisposition to venous thromboembolism (VTE). In this study, we conducted a multi-ancestry genome-wide association study (GWAS) of quantitative total anti-β2GPI levels in 5,969 participants enrolled in the Multi-Ethnic Study of Atherosclerosis (MESA) and identified a genome-wide significant association at the APOH locus. Paradoxically, genetically determined increases in anti-β2GPI levels at this locus were associated with lower VTE risk. Fine-mapping and functional genomics prioritized the missense variant rs1801690 (W335S) in β2GPI (apolipoprotein H, [APOH]) as the most likely causal variant. This variant has an allele frequency of 5-6% in European and East Asian ancestries but only 1% in African ancestries. Integrating prior experimental studies, molecular dynamics simulations and structure-based epitope prediction, we propose a dual-effect mechanism whereby W335S reduces thrombotic risk by disrupting phospholipid binding in Domain V, yet increases autoantibody production through conformational changes that enhance epitope exposure in Domains I and II. These findings mechanistically uncouple autoantibody formation from thrombotic risk in carriers of the W335S variant, and suggest that APOH genotype may represent a clinically relevant genetic biomarker with potential utility for thrombotic risk stratification in anti-β2GPI-positive individuals.

7
FA-NIVA: A Nextflow framework for automated analysis of Nanopore based long-read sequencing data for genetic analysis in Fanconi anemia

Neurgaonkar, P.; Dierolf, M.; O'Gorman, L.; Remmele, C.; Schaeffer, J.; Popp, I.; Borst, A.; Rost, S.; Ankenbrand, M.; Kratz, C.; Bergmann, A.; Kalb, R.; Yu, J.

2026-03-04 genetic and genomic medicine 10.64898/2026.02.27.26346867
Top 0.3%
45× avg

Motivation: Fanconi anemia (FA) is a rare disease mainly caused by biallelic pathogenic variants, including structural variants such as large deletions and insertions in FA genes. Currently, variant detection is based on short-read sequencing and probe-based approaches. However, determining the exact genomic breakpoint or achieving allelic discrimination remains challenging. Nanopore-based long-read sequencing enables a comprehensive detection of FA variants, but a unified bioinformatic analysis platform for these data is missing. Results: We present FA-NIVA (Fanconi anemia - Nanopore Indel and Variant Analysis), an automated and adaptable analysis workflow tailored for Nanopore-based long-read sequencing data in FA genetic analysis. FA-NIVA integrates state-of-the-art tools to comprehensively detect both single nucleotide variants (SNVs) and structural variants (SVs). Our analysis platform enhances genotyping accuracy for biallelic variants by a joint SNV-SV based phasing in FA-associated genes. Built within the Nextflow ecosystem and powered by containerized Docker images, FA-NIVA ensures reproducibility, flexibility, scalability and transparency across different computing environments. Together, FA-NIVA provides a robust end-to-end solution for the automated analysis of SVs and SNVs and high-resolution phasing analysis in FA genes, enabling an accurate and efficient pipeline for genetic analysis. Availability: FA-NIVA is available on GitHub at: https://github.com/UKWgenommedizin/FA-NIVA.

8
A network-based atlas of human skeletal muscle aging

Stokes, T.; Lim, C.; Ali, M.; Mcleod, J. C.; Gisby, J.; Mariniello, K.; Crossland, H.; Sharif, J.; Deane, C.; Moseley, T. C.; Ismail, N. M.; Lixandrao, M. E.; Volmar, C.-H.; McCormick, P. J.; Brogan, R. J.; Whiteford, J.; Roschel, H.; Phillips, S.; Gallagher, I. J.; Slabaugh, G.; Phillips, B. E.; Kraus, W. E.; Atherton, P. J.; Chapple, J. P.; Timmons, J. A.

2026-02-17 genetic and genomic medicine 10.64898/2026.02.15.26346348
Top 0.7%
27× avg

Skeletal muscle metabolic and physical capacities are influenced by both genetics and load status and decline with age. Recent advances in sequencing have characterised cell types in unprecedented detail, yet these approaches do not scale to adequately model human muscle physiological heterogeneity. We produced a powerful resource for ageing studies, including consistent deep transcriptomic profiles of 1,675 human muscle biopsies (~28,000 genes per profile) and multiple single-cell spatial transcriptomic technologies. We present several novel models of tissue ageing. Five quantitative network models (QNMs), built using >40 trillion calculations and 930 human muscle transcriptomes, modelled aging and the influence of load status. Additional differential expression (DE) signatures for atrophy, hypertrophy and cardio-respiratory adaptation were integrated with single-cell RNAseq and cell-specific bulk profiles to reveal cell-enriched modules and the topology of human skeletal aging. Rapamycin transcriptomes from cultured muscle and endothelial cells, along with in vivo signatures for insulin resistance and sex, were integrated into these analyses. We show that >3,000 genes are DE with muscle age (equally up and down); that a novel pre-frailty signature in elderly subjects has a remarkably strong overlap with the response of healthy muscle during experimental atrophy; and that the hypertrophy signature in elderly muscle, but not young muscle, opposes the age-regulated transcriptome. We report that non-responders for hypertrophy or gains in cardio-respiratory capacity have highly distinct genome-level responses to exercise. QNM revealed cell-specific processes in endothelial cells and fibroblasts, including novel interactions between insulin sensitivity, age and senescence. 
Of the 286 hub genes consistent in both young and old muscle network models, 27% had known roles in muscle biology, while of the top 50 hub genes (45% protein coding), 80% were newly linked to human muscle biology, including ARHGAP4, CEP131 and IFITM10 and many short and long noncoding RNAs. Many genes demonstrated extreme changes in topology in old muscle, such as the neddylation- and aging-linked gene DCUN1D5. GeoMX-based spatial muscle fibre-type profiling (57 regions), along with Xenium (8 regions) and Merscope (54 regions) single-cell spatial technologies, located key aging, frailty and load-responsive genes to individual cell types and provided novel insight into the location of autocrine/paracrine secreted factors such as GDNF, while IL6 was located to rare endothelial cells. A machine-learning model ranked the factors most associated with the topological changes with age. This prioritised network features over DE signatures, highlighting that positively correlating edges to genes down-regulated during atrophy, genes up-regulated by Rapamycin, and both positively and negatively correlating insulin sensitivity features, along with gene hub status, best explained muscle ageing. Genome-level modelling produced an independently validated transcriptomic age clock and found it to be invariant to muscle load status in people >50y, while we revealed novel interactions between gene length and age. Release of an unprecedented level of consistently aligned genomic data, along with QNMs with >7,000 searchable modules, provides a powerful resource for the aging research communities.

9
An Integrated Deep Learning Framework for Small-Sample Biomedical Data Classification: Explainable Graph Neural Networks with Data Augmentation for RNA sequencing Dataset

Guler, F.; Goksuluk, D.; Xu, M.; Choudhary, G.; Agraz, M.

2026-02-24 genetic and genomic medicine 10.64898/2026.02.22.26346827
Top 0.7%
27× avg

Applying deep learning models to RNA-Seq data poses substantial challenges, primarily due to the high dimensionality of the data and the limited sample sizes. To address these issues, this study introduces an advanced deep learning pipeline that integrates feature engineering with data augmentation. The engineering application focuses on biomedical engineering, specifically the classification of RNA-Seq datasets for disease diagnosis. The proposed framework was initially validated on synthetic datasets generated from Naive Bayes, where MLP-based augmentation yielded a notable improvement in predictive performance. Building on this foundation, we applied the approach to chromophobe renal cell carcinoma (KICH) RNA-Seq data from The Cancer Genome Atlas (TCGA). Following standard preprocessing steps (normalization, transformation, and dimensionality reduction), the analysis concentrated on three main aspects: augmentation strategies, preprocessing methods, and explainable AI (XAI) techniques in relation to classification outcomes. Feature selection was performed through PCA, Boruta, and RF-based methods. Three augmentation strategies (linear interpolation, SMOTE, and MixUp) were evaluated. To maintain methodological rigor, augmentation was applied exclusively to the training set, while the test set was held out for unbiased evaluation. Within this framework, we conducted a comparative assessment of multiple deep learning architectures, including MLP, GNN, and the recently proposed Kolmogorov-Arnold networks (KAN). The GNN achieved the highest classification accuracy (99.47%) and the best F1 score (0.9948) when trained with MixUp augmentation combined with RF feature selection. Consequently, the GNN-based XAI framework was applied to the RF dataset enriched with MixUp. 
XAI analyses identified the top 20 most influential genes, such as HNF4A, DACH2, MAPK15, and NAT2, which played the greatest role in classification, thereby confirming the biological plausibility of the model outputs. To further validate model robustness, cervical cancer and Alzheimer's RNA-Seq datasets were also tested, yielding consistent and reliable results. Overall, the findings highlight the value of incorporating data augmentation into deep learning models for RNA-Seq analysis, not only to improve predictive performance but also to enhance biological interpretability through explainable AI approaches.
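
MixUp, one of the augmentation strategies named in this abstract, builds synthetic samples as convex combinations of pairs of training samples and their labels. The sketch below shows the general technique only, applied to a training set as the abstract describes; the `alpha` value, the one-hot label encoding, and the function name are assumptions and do not reflect the authors' pipeline.

```python
import random


def mixup(features, labels, n_new, alpha=0.2, seed=0):
    """Generate n_new MixUp samples from training data only.

    features: list of equal-length lists of floats (one row per sample).
    labels: list of one-hot label lists, aligned with features.
    alpha: Beta(alpha, alpha) shape parameter (hypothetical default).
    """
    rng = random.Random(seed)
    new_features, new_labels = [], []
    for _ in range(n_new):
        i = rng.randrange(len(features))
        j = rng.randrange(len(features))
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        # Mix features and labels with the same weight.
        new_features.append(
            [lam * a + (1 - lam) * b for a, b in zip(features[i], features[j])]
        )
        new_labels.append(
            [lam * a + (1 - lam) * b for a, b in zip(labels[i], labels[j])]
        )
    return new_features, new_labels
```

Calling this on the training split alone keeps the held-out test set free of synthetic samples, consistent with the evaluation design stated in the abstract.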

10
Integrative screening identifies functional variants and VNTRs underlying GWAS signals at the 5p15.33 multi-cancer susceptibility locus

O'Brien, A.; Kong, H.; Patel, H.; Ho, M.; Patel, M. B.; Zhong, J.; Xu, M.; Papenberg, B. W.; Connelly, K. E.; Collins, I.; Hennessey, R.; Thakur, R.; Sowards, H.; Funderburk, K.; Luong, T.; Florez-Vargas, O.; Myers, T.; Jermusyk, A.; Gorman, B.; Luo, W.; Jones, K.; Das, S.; Lan, Q.; Rothman, N.; McKay, J. D.; Hung, R. J.; Amos, C. I.; Iles, M. M.; Koutros, S.; Landi, M. T.; Law, M. H.; Stolzenberg-Solomon, R. Z.; Wolpin, B.; Hassan, M.; Klein, A. P.; Antwi, S. O.; Orr, N.; Chanock, S. J.; Lindstroem, S.; Hoskins, J. W.; Stern, M.-H.; Andresson, T.; Shi, J.; Prokunina-Olsson, L.; Choi, J.; Brow

2026-03-04 genetic and genomic medicine 10.64898/2026.03.03.26347427
Top 0.7%
27× avg

Chromosome 5p15.33 harbors several independent association signals that demonstrate antagonistic pleiotropy across cancer types, with causal mechanisms largely unresolved. To identify functional variants and enhancer elements at this locus, we performed statistical fine-mapping followed by massively parallel reporter assays (MPRA) and proliferation-based CRISPRi screens. This approach identified eight multi-cancer functional variants (MCFVs) across three GWAS signals. Targeting rs421629 (part of the CLPTM1L signal marked by rs465498) with CRISPRi revealed opposing effects on TERT expression in pancreatic versus lung cancer cells, consistent with the antagonistic pleiotropy observed for this signal. Furthermore, CRISPRi nominated an intronic CLPTM1L variable number tandem repeat (VNTR) as a potent enhancer. Long-read sequencing established VNTR polymorphisms as potential causal variants for the rs465498 signal. We showed that Hippo-pathway transcription factors mediate VNTR enhancer activity in lung and pancreatic cancer cells. Together, these findings indicate that cancer susceptibility at 5p15.33 may be mediated by both SNPs and VNTRs and provide an integrated framework for resolving complex pleiotropic loci.

11
Massively parallel functional profiling identifies CCDC88C as a risk gene for ER-positive breast cancer

Mackie, K.; Kemp, H.; Gunnell, A.; Studd, J. B.; Went, M.; Law, P.; Tomczyk, K.; Sevgi, S.; Lu, Y.; Orr, N.; Houlston, R. S.; Johnson, N.; Fletcher, O.; Haider, S.

2026-03-03 genetic and genomic medicine 10.64898/2026.03.02.26347419
Top 0.7%
27× avg

Genome-wide association studies (GWAS), combined with fine-mapping, have identified 196 independent signals associated with breast cancer risk. Deciphering the functional basis of these associations can inform our understanding of the biology and aetiology of breast cancer. Decoding GWAS risk associations is challenging due to linkage disequilibrium between variants and because most variants map to non-coding regions, influencing breast cancer risk via cis-regulatory mechanisms that modulate the expression of target genes. To identify the functional variants driving breast cancer risk associations, we carried out a lentivirus-based massively parallel reporter assay (lentiMPRA) to screen 5,116 credible causal variants across these signals. We identified 709 variants, mapping to 140 risk regions, that are associated with significant variation between REF and ALT alleles. A follow-up investigation at 14q32.11 revealed that rs7153397 may impact expression of CCDC88C to influence both breast cancer risk and prognosis. These findings provide a prioritised set of functional variants for downstream analyses, advancing our understanding of breast cancer risk mechanisms.

12
Spatial transcriptomics reveals mechanism of autoimmunity driven by internalized autoantibodies

Pinal-Fernandez, I.; Pak, K.; Casal-Dominguez, M.; Munoz-Braceras, S.; Wigerblad, G.; Dell'Orso, S.; Naz, F.; Islam, S.; Gutierrez-Cruz, G.; Kinder, T. B.; Ogbonnaya-Whittlesey, S. A.; Fernandez-Codina, A.; Giannini, M.; Ellezam, B.; Laverny, G.; Gilbart, V.; Landon-Cardinal, O.; Hudson, M.; Troyanov, Y.; Randazzo, D.; Kenea, A.; Matas-Garcia, A.; Garrabou, G.; Aldecoa, I.; Ailen-Caballero, G.; Gil-Vila, A.; Trallero, E.; Milone, M.; Liewluck, T.; Naddaf, E.; Espinosa, G.; Simeon-Aznar, C. P.; Guillen-Del-Castillo, A.; Preusse, C.; Kleefeld, F.; Bublitz, N.; Stenzel, W.; Meyer, A.; Pope, J. E.

2026-02-17 rheumatology 10.64898/2026.02.14.26346329
Top 0.8%
26× avg

Autoantibody internalization has been implicated in autoimmune disease pathogenesis, yet its mechanisms and its generality across different diseases, cell types, and affected tissues remain poorly defined. Using bulk RNA sequencing, we identified reproducible, autoantibody-specific transcriptomic signatures consistent with autoantigen dysfunction in muscle biopsies from patients with anti-Mi2 dermatomyositis and anti-PM/Scl scleromyositis across independent cohorts. Electroporation of purified patient IgG into primary cultures of healthy cells was sufficient to induce the corresponding transcriptomic programs in vitro. Direct immunofluorescence demonstrated immunoglobulin internalization into subcellular compartments matching the localization of the autoantigen in different affected tissues. Spatial transcriptomic analyses revealed that antibody-secreting cells translocated cytoplasmic material (i.e., immunoglobulin RNA) into adjacent affected cells expressing autoantibody-specific transcripts. The disease-specific transcripts were present not only in muscle fibers, but also in other cells, including macrophages, endothelial cells, and fibroblasts. Autoantibody-induced transcriptomic programs were associated with cell damage and autoantibody-specific reactive inflammatory programs, including activation of type I interferon and TGF-β1 signaling in anti-Mi2 dermatomyositis and activation of type II interferon in anti-PM/Scl scleromyositis. Antibody internalization was also observed in different tissues from patients with other autoimmune diseases, including anti-U1RNP mixed connective tissue disease, anti-Ku overlap syndrome, and anti-Scl70 systemic sclerosis. Together, these findings establish autoantibody internalization as a shared pathogenic mechanism across diverse autoimmune diseases, providing a unifying framework for conditions driven by autoantibodies against intracellular antigens.

13
Leveraging genome-wide effects on gene expression to identify disease-critical genes with trans-genetic components

Brunton, K.; Ragsac, M. F.; Amariuta, T.

2026-02-25 genetic and genomic medicine 10.64898/2026.02.23.26346922
Top 0.8%
26× avg

Genome-wide association studies (GWAS) have implicated tens of thousands of genetic variants associated with complex traits and polygenic diseases. Colocalizing GWAS variants with variants that may regulate gene expression, via expression quantitative trait loci (eQTL) mapping, has successfully led to the identification of disease-critical genes and their cell types of action. Recent studies predominantly colocalize proximal cis-eQTLs, which are estimated to explain ~10% of variance in gene expression levels. However, trans-eQTLs have been hypothesized to account for an additional ~20% of variance in expression levels, although few studies have attempted to quantify the variance explained by empirically associated trans-eQTLs. Here, we introduce EGRET (Estimating Genome-wide Regulatory Effects on the Transcriptome), an ensemble framework that jointly models cis-eQTLs with three distinct trans-eQTL mapping approaches: standard pairwise association testing via Matrix eQTL, and two functionally informed methods, trans-PCO and GBAT. In real data, EGRET produced 353,408 predictive gene expression models (cross-validation R² > 0, p < 0.01) across 49 GTEx tissues, including 12,317 gene-tissue pairs with a significantly nonzero trans-heritable component. For this set of genes, EGRET models explain 33% more gene expression variance than cis-eQTL models (EGRET average R² = 0.104, FUSION average R² = 0.078). We found that putative trans-regulating variants of EGRET models are enriched for regulatory elements such as enhancers, histone marks, and cis-eQTLs of other genes. We then hypothesized that EGRET models could nominate new disease-critical genes via a transcriptome-wide association study (TWAS) framework that models genome-wide regulatory effects on gene expression. 
In simulations of theoretically representative gene expression architectures ([~]30% heritability, where more than 70% is distal), EGRET increased the power to detect disease-critical genes by 1.2x-3.1x compared to cis-eQTL models. In real data analysis, we identified disease-associated genes via TWAS across GWAS summary statistics for 78 complex traits and polygenic diseases using gene expression prediction models from EGRET, cis-eQTL FUSION, and two state-of-the-art trans-eQTL TWAS methods, MOSTWAS and BGW-TWAS. EGRET identified 450,825 gene-disease associations that were not identified by FUSION models, 2,900 associations not identified by MOSTWAS, and 5,498 associations not identified by BGW-TWAS. Finally, we used EGRET models to construct gene regulatory networks, some of which harbored genes that were jointly associated with complex traits. For example, the gene members of the network defined by ARHGEF3, whose cis-regulatory variants help predict expression of 10 genes in trans, were concordantly associated with platelet count using EGRET but not FUSION models. Overall, we find that modeling the genome-wide genetic component of gene expression greatly boosts the detection of disease-critical genes and helps define gene regulatory networks while improving the characterization of GWAS variants.
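
As a quick arithmetic check on the "33% more gene expression variance" figure, the relative gain can be recomputed from the two average R² values quoted in the abstract. This is only an illustrative sketch: the variable names are hypothetical and do not come from the preprint or from the EGRET or FUSION software.

```python
# Average cross-validation R^2 values as quoted in the abstract.
# Variable names are illustrative assumptions, not EGRET/FUSION identifiers.
egret_r2 = 0.104
fusion_r2 = 0.078

# Relative gain in variance explained by EGRET over the cis-only FUSION models.
relative_gain = (egret_r2 - fusion_r2) / fusion_r2

print(f"Relative gain in variance explained: {relative_gain:.1%}")
```

The absolute difference is 0.026 in R², so the quoted gain is a relative increase over the cis-only baseline rather than 33 percentage points of additional variance.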

14
Characterization of the somatic landscape and transcriptional profile of breast tumors from 748 Hispanic/Latina women in California

Ding, Y.; Sayaman, R. W.; Wolf, D.; Mortimer, J.; Mao, A.; Fejerman, L.; Gruber, S. B.; Neuhausen, S. L.; Ziv, E.

2026-02-17 genetic and genomic medicine 10.64898/2026.02.13.26346286
Top 1.0%
25× avg

Somatic mutations and the tumor immune microenvironment in breast tumors are important predictors of treatment response and survival, yet data for Hispanic/Latina (H/L) women are limited. Here we analyzed whole exome sequencing data from tumor/normal pairs and RNA-seq data from 748 H/L women and 388 non-Hispanic White (NHW) women. Overall, the somatic profiles in tumors from H/L women were similar to those in tumors from NHW women. However, somatic mutations in the genome organizer CTCF were significantly more common in H/L women. We also found that tumor microenvironment immune ecotypes CE9 and CE10, characterized by increased lymphocyte infiltration and more favorable prognosis, were more common among women with higher Indigenous American ancestry. Finally, we found that a germline APOBEC3A/B copy-number deletion was more prevalent in H/L than in NHW women and was associated with the COSMIC APOBEC mutational signatures and with the CE10 ecotype. Overall, these results suggest that ancestry differences may provide insights into specific mutation and immune profiles.

15
Genetic liability to hip osteoarthritis confers neurovascular protection against Alzheimer's disease despite depression-mediated phenotypic comorbidity

Xu, Q.; Zhao, P.; Tao, J.; Zheng, H.

2026-03-04 genetic and genomic medicine 10.64898/2026.03.04.26347509
Top 1%
23× avg

Background: The relationship between hip osteoarthritis (hip OA) and Alzheimer's disease (AD) presents a critical paradox within the emerging "bone-brain axis": widespread phenotypic comorbidity sharply contradicts evolutionary theories of biological antagonism. This study integrates longitudinal and multi-omic analyses to determine whether this clinical overlap masks an underlying genetic neuroprotection. Methods: We analyzed longitudinal phenotypic data from 261,767 UK Biobank participants using Cox proportional hazards and Fine-Gray competing risk models. To investigate the shared genetic architecture, we applied MiXeR modeling to genome-wide association study summary statistics. Causal relationships were evaluated using global and cell-type-stratified Mendelian randomization across eight distinct brain cell types. Shared genomic loci were identified via conjunctional/conditional false discovery rate and fine-mapping. Single-nucleus RNA-sequencing (snRNA-seq) data from the ROSMAP cohort validated the disease-associated transcriptional dynamics of prioritized target genes. Results: Observational survival analyses initially suggested an increased AD risk in patients with hip OA; however, this association was fully attenuated after adjusting for a history of depression, revealing a "phenotypic illusion" driven by the pain-depression axis. Conversely, cell-type-stratified genomic analyses uncovered a profound biological antagonism: genetic liability for hip OA confers robust neuroprotection specifically localized to the neurovascular unit (NVU), primarily driven by astrocytes and pericytes. Mechanistically, this NVU fortification is orchestrated by the MAPT locus and PI3K/AKT signaling, with snRNA-seq confirming the active transcriptional remodeling of these core effectors in the AD brain. 
Conclusion: We demonstrate that genetic liability to hip OA confers robust neurovascular protection against AD, a profound biological antagonism that is clinically masked by depression-mediated phenotypic comorbidity. These findings propose an evolutionary trade-off model within the bone-brain axis, underscoring the urgency of active hip OA pain management to mitigate depressive symptoms and decelerate cognitive aging, while cautioning against the uncritical repurposing of anabolic inhibitors across these interconnected systems.

16
Familial medullary thyroid carcinoma secondary to an SLC30A9 intragenic deletion and translation reinitiation

Iacovazzo, D.; Begalli, F.; Suleyman, O.; Doleschall, M.; Alevizaki, M.; Ashelford, K. E.; Awad Mahmoud, S.; Barlier, A.; Barry, S.; Brain, C.; Cabrera, C. P.; Castinetti, F.; Chiloiro, S.; Colclough, K.; Csabi, A.; Druce, M. R.; Dutta, P.; Fatih, J. M.; Foulkes, W. D.; Gandhi, M.; Grochowski, C. M.; Hall, C. L.; Jarzab, B.; Klein, K. O.; Krajewska, J.; Kurzawinski, T. R.; Lamers, S.; Lugli, F.; Magid, K.; Margraf, R.; Martin, C. S.; Mathiesen, J. S.; Mihai, R.; Morrison, P. J.; Mozere, M.; Oczko-Wojciechowska, M.; Owens, M.; Ozretic, L.; Patocs, A.; Piacentini, S.; Punetha, J.; Romanet, P.; S

2026-02-27 genetic and genomic medicine 10.64898/2026.02.26.26346165
Top 1%
23× avg

While most individuals with familial medullary thyroid carcinoma (fMTC) carry RET mutations, in some instances the causative mutations remain unknown. We studied two related families with RET-negative fMTC in 21 affected individuals through linkage analysis, exome/genome sequencing, and high-density array comparative genomic hybridization. We identified a novel heterozygous 40 kb intragenic SLC30A9 deletion that segregated with the disease in all affected individuals. The mutant transcript escaped nonsense-mediated decay and resulted in the production of N-terminally truncated proteins via translation reinitiation from in-frame AUG codons located downstream of the deletion. These proteins showed increased stability, and their expression in an MTC cell line increased cell proliferation and clonogenic capacity, supporting an oncogenic role. These findings expand the genetic background of fMTC beyond RET mutations and implicate translation reinitiation in the etiology of cancer susceptibility syndromes secondary to structural genomic variants.

17
Clinical, in vitro, and in vivo evidence of WAPL as a novel cohesinopathy gene and phenotypic driver of 10q22.3q23.2 genomic disorder

Boone, P. M.; Erdin, S.; Mohamed, A.; Haghshenas, S.; Faour, K. N. W.; Kao, E.; Fu, J.; Auwerx, C.; Harripaul, R.; Jana, B.; Springer, D.; Hallstrom, G.; de Esch, C. E. F.; Denhoff, E.; Holmes, L.; Mohajeri, K.; Lemanski, J.; Kerkhof, J.; McConkey, H.; Rzasa, J.; McCune, M. J.; Levy, M. A.; Grafstein, J.; Larson, M.; Wright, Z.; Beauchamp, R. L.; Lucente, D.; Abou Jamra, R.; Agrawal, N.; Agrawal, P. B.; Andersen, E. F.; Argilli, E.; Araiza, R.; Ballal, S.; Baxter, M. F.; Bergant, G.; Bertsche, A.; Bhavsar, R.; Bortola, D. R.; Bothe, V.; Brasch-Andersen, C.; Braun, D.; Bruel, A.-L.; Buchanan, C

2026-02-28 genetic and genomic medicine 10.64898/2026.02.23.26346364
Top 1%
18× avg

Cohesin is a fundamental genome-organizing complex that orchestrates three-dimensional chromosome folding and gene expression via DNA loop extrusion. Alterations to genes encoding cohesin subunits and cohesin loaders cause Mendelian disorders, including Cornelia de Lange syndrome (CdLS). By contrast, disruption of factors that remove cohesin from DNA, including WAPL and its binding partners PDS5A and PDS5B, has not yet been associated with human disease. Here, we explored the relevance of these cohesin release factors in Mendelian disease by establishing a rare disease cohort of deeply phenotyped individuals with heterozygous, predicted damaging variants in WAPL (n=27), PDS5A (n=8), and PDS5B (n=8), by modeling WAPL deficiency in human cell lines and mice, and by aggregating rare disease association statistics from consortia studies. We identified a WAPL-related disorder characterized by developmental delay, intellectual disability, and risk of other developmental anomalies including clubfoot. Similarities between individuals with damaging WAPL variants and those with large, recurrent 10q22.3q23.2 (10q) deletions (which encompass WAPL) nominate WAPL as a driver gene within this genomic disorder region. While carriers of PDS5A or PDS5B variants exhibited features of developmental disorders, neither cohort-based statistics nor case phenotyping associated these genes with specific phenotypes. We used CRISPR engineering to generate truncating variants in WAPL, as well as the 7.8 Mb 10q deletion or duplication, in human iPSCs and induced neurons. Transcriptomic analyses identified differentially expressed genes in both models, with highly significant overlap between WAPL haploinsufficiency and 10q deletion signatures. 
Mice with 50% residual Wapl expression exhibited mild deficits of growth and learning/memory, whereas those with 25% residual Wapl expression displayed birth defects and postnatal lethality, revealing a dosage liability threshold below the level of heterozygosity. In summary, we delineated a novel genetic condition caused by cohesin release factor deficiency, nominated WAPL as a driver gene within a genomic disorder region, and further illuminated dosage sensitivity of human cohesin.

18
Decoding Pathogenic and Resilient Gene Regulatory Interactions in Alzheimer's Disease

Spencer, C.; PsychAD Consortium; N.M., P.; Hong, A.; Casey, C.; Shao, Z.; Alvia, M.; Argyriou, S.; Katsel, P.; Auluck, P. K.; Barnes, L. L.; Marenco, S.; Bennett, D. A.; Girdhar, K.; Voloudakis, G.; Haroutunian, V.; Bendl, J.; Hoffman, G. E.; Fullard, J. F.; Lee, D.; Roussos, P.

2026-02-26 genetic and genomic medicine 10.64898/2026.02.19.26346666
Top 1%
16× avg

The molecular basis of cognitive resilience in Alzheimer's disease (AD), wherein individuals harbor substantial neuropathology yet maintain cognition, remains poorly understood. To systematically decode the regulatory logic underlying divergent cognitive outcomes, we constructed the largest cell-type-resolved gene regulatory network (GRN) atlas of AD to date, profiling 1.7 million nuclei from 687 individuals classified as Controls, cognitively Resilient, or AD dementia across 27 cell types in the human dorsolateral prefrontal cortex. From 223 high-confidence transcription factor regulons, we identify a three-state framework of transcriptional dysregulation: homeostatic erosion of IRF8/STAT1 interferon programs in microglia (State I), compensatory NF-κB suppression via BCL6 in glial populations that distinguishes resilient from demented individuals despite equivalent neuropathological burden (State II), and pathogenic escalation through FLI1/IKZF1 network expansion driving vascular-immune remodeling in AD (State III). NF-κB emerges as the central regulatory hub, with BCL6-mediated repression and FLI1/RELA-driven activation constituting opposing molecular switches that determine cognitive trajectory. These findings, replicated across independent cohorts, reframe resilience as an active regulatory state rather than attenuated disease, and nominate BCL6, IRF8, and FLI1 as priority targets for interventions aimed at extending the compensatory window before dementia onset.

19
Misclassification of heritable mortality undermines estimates of intrinsic life span heritability

Hamilton, F. W.

2026-02-27 genetic and genomic medicine 10.64898/2026.02.26.26347172
Top 1%
16× avg

In a recent article in Science, Shenhar et al. report that human life span heritability reaches ~55% after removing "extrinsic" mortality, roughly seven-fold higher than recent large pedigree estimates. This conclusion rests on classifying deaths from infections and accidents as environmental noise independent of genetics. This premise is biologically untenable: susceptibility to severe infection is substantially heritable, with adoptee studies showing relative risks exceeding 5 for infection death when a biological parent died of infection. Because the assumption that extrinsic mortality is non-genetic is encoded directly into their Gompertz-Makeham model, removing that mortality necessarily inflates heritability estimates. This creates selection bias rather than correcting for confounding and explains the contradiction with both pedigree studies and GWAS findings. The proposed heritability estimate is therefore not the true heritability of any population, past or present.

20
Short tandem repeats significantly contribute to the genetic architecture of metabolic and sensory age-related hearing loss phenotypes

Ahmed, S.; Vaden, K. I.; Dubno, J. R.; Wright, G.; Drogemoller, B.

2026-02-18 genetic and genomic medicine 10.64898/2026.02.17.26346449
Top 1%
16× avg

Age-related hearing loss (ARHL) is a progressive, bilateral decline in hearing ability that affects one in four individuals over 60 years of age worldwide. While previous genome-wide association studies (GWAS) have identified distinct single-nucleotide variants (SNVs) associated with metabolic and sensory ARHL phenotypes, the contribution of short tandem repeats (STRs), a neglected yet important class of genetic variants, remains poorly understood. To address this gap, TRTools was used to impute STRs from a high-quality, sequencing-derived SNV-STR reference panel to investigate the association between STRs and metabolic and sensory estimates. Heritability analyses revealed that while STRs contribute to estimates of both ARHL components, this class of variation plays a more important role in metabolic hearing loss (6%), which typically increases with age, compared to sensory hearing loss (4%). Further, the inclusion of this class of variant in GWAS analyses uncovered an association between a haplotype consisting of two missense variants (rs7714670 and rs6453022) and an intronic STR (chr5:73778077:A16) in ARHGEF28 (P = 3.30 × 10⁻⁹), providing further insight into the variants driving this previously identified signal. Notably, burden analyses revealed that rare and longer repeats were associated with an increased risk of the metabolic phenotype and a reduced risk of the sensory phenotype. Functional annotation of significant and nominally significant STRs revealed potential effects on gene expression and splicing of nearby genes. Our findings provide the first evidence that STRs explain some of the missing heritability of ARHL phenotypes and create an STR resource for researchers to use in future analyses.